Documents authored by Carvalho, João Paulo


Document
Detecting a Tweet’s Topic within a Large Number of Portuguese Twitter Trends

Authors: Hugo Rosa, João Paulo Carvalho, and Fernando Batista

Published in: OASIcs, Volume 38, 3rd Symposium on Languages, Applications and Technologies (2014)


Abstract
In this paper we address Twitter topic detection in the presence of a large number of trending topics. We use a new technique, called Twitter Topic Fuzzy Fingerprints, and compare it with two popular text classification techniques, Support Vector Machines (SVM) and k-Nearest Neighbours (kNN). Preliminary results show that it outperforms the other two techniques while also being much faster, an essential feature when processing large volumes of streaming data. We focused on a data set of Portuguese-language tweets and the respective top trends as indicated by Twitter.

Cite as

Hugo Rosa, João Paulo Carvalho, and Fernando Batista. Detecting a Tweet’s Topic within a Large Number of Portuguese Twitter Trends. In 3rd Symposium on Languages, Applications and Technologies. Open Access Series in Informatics (OASIcs), Volume 38, pp. 185-199, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)


@InProceedings{rosa_et_al:OASIcs.SLATE.2014.185,
  author =	{Rosa, Hugo and Carvalho, Jo\~{a}o Paulo and Batista, Fernando},
  title =	{{Detecting a Tweet’s Topic within a Large Number of Portuguese Twitter Trends}},
  booktitle =	{3rd Symposium on Languages, Applications and Technologies},
  pages =	{185--199},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-68-2},
  ISSN =	{2190-6807},
  year =	{2014},
  volume =	{38},
  editor =	{Pereira, Maria Jo\~{a}o Varanda and Leal, Jos\'{e} Paulo and Sim\~{o}es, Alberto},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2014.185},
  URN =		{urn:nbn:de:0030-drops-45696},
  doi =		{10.4230/OASIcs.SLATE.2014.185},
  annote =	{Keywords: topic detection, social networks data mining, Twitter, Portuguese language}
}

Document
Expanding a Database of Portuguese Tweets

Authors: Gaspar Brogueira, Fernando Batista, João Paulo Carvalho, and Helena Moniz

Published in: OASIcs, Volume 38, 3rd Symposium on Languages, Applications and Technologies (2014)


Abstract
This paper describes an existing database of geolocated tweets that were produced in Portuguese regions and proposes an approach to further expand it. The existing database covers eight consecutive days of collected tweets, totaling about 300 thousand tweets, produced by about 11 thousand different users. A detailed analysis of the content of the messages suggests a predominance of young authors who use Twitter as a way of reaching their colleagues with their feelings, ideas and comments. In order to further characterize this community of young people, we propose a method for retrieving additional tweets produced by the same set of authors already in the database. Our goal is to further extend the knowledge about each user of this community, making it possible to automatically characterize each user by the content he/she produces, to cluster users, and to open other possibilities in the scope of social analysis.

Cite as

Gaspar Brogueira, Fernando Batista, João Paulo Carvalho, and Helena Moniz. Expanding a Database of Portuguese Tweets. In 3rd Symposium on Languages, Applications and Technologies. Open Access Series in Informatics (OASIcs), Volume 38, pp. 275-282, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)


@InProceedings{brogueira_et_al:OASIcs.SLATE.2014.275,
  author =	{Brogueira, Gaspar and Batista, Fernando and Carvalho, Jo\~{a}o Paulo and Moniz, Helena},
  title =	{{Expanding a Database of Portuguese Tweets}},
  booktitle =	{3rd Symposium on Languages, Applications and Technologies},
  pages =	{275--282},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-68-2},
  ISSN =	{2190-6807},
  year =	{2014},
  volume =	{38},
  editor =	{Pereira, Maria Jo\~{a}o Varanda and Leal, Jos\'{e} Paulo and Sim\~{o}es, Alberto},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2014.275},
  URN =		{urn:nbn:de:0030-drops-45763},
  doi =		{10.4230/OASIcs.SLATE.2014.275},
  annote =	{Keywords: Twitter, corpus of Portuguese tweets, Twitter API, natural language processing, text analysis}
}